Approximate Dynamic Programming By Minimizing Distributionally Robust Bounds

Author

  • Marek Petrik
Abstract

Approximate dynamic programming is a popular method for solving large Markov decision processes. This paper describes a new class of approximate dynamic programming (ADP) methods, distributionally robust ADP (DRADP), that address the curse of dimensionality by minimizing a pessimistic bound on the policy loss. This approach turns ADP into an optimization problem, for which we derive new mathematical program formulations and analyze their properties. DRADP improves on the theoretical guarantees of existing ADP methods: it guarantees convergence and L1 norm-based error bounds. The empirical evaluation of DRADP shows that the theoretical guarantees translate well into good performance on benchmark problems.

Similar resources

A Practically Efficient Approach for Solving Adaptive Distributionally Robust Linear Optimization Problems

We develop a modular and tractable framework for solving an adaptive distributionally robust linear optimization problem, where we minimize the worst-case expected cost over an ambiguity set of probability distributions. The adaptive distributionally robust optimization framework caters for dynamic decision making, where decisions can adapt to the uncertain outcomes as they unfold in stages. Fo...

Distributionally Robust Logistic Regression

This paper proposes a distributionally robust approach to logistic regression. We use the Wasserstein distance to construct a ball in the space of probability distributions centered at the uniform distribution on the training samples. If the radius of this ball is chosen judiciously, we can guarantee that it contains the unknown data-generating distribution with high confidence. We then formulat...

Distributionally Robust Stochastic Knapsack Problem

This paper considers a distributionally robust version of a quadratic knapsack problem. In this model, a subset of items is selected to maximize the total profit while requiring that a set of knapsack constraints be satisfied with high probability. In contrast to the stochastic programming version of this problem, we assume that only partial information on the random data is known, i.e., the firs...

Distributionally robust stochastic shortest path problem

This paper considers a stochastic version of the shortest path problem, the Distributionally Robust Stochastic Shortest Path Problem (DRSSPP), on directed graphs. In this model, the arc costs are deterministic, while each arc has a random delay. The mean vector and the second-moment matrix of the uncertain data are assumed known, but the exact distribution is unknown. A penalty...

A Cutting Surface Algorithm for Semi-Infinite Convex Programming with an Application to Moment Robust Optimization

We first present and analyze a central cutting surface algorithm for general semi-infinite convex optimization problems, and use it to develop an algorithm for distributionally robust optimization problems in which the uncertainty set consists of probability distributions with given bounds on their moments. The cutting surface algorithm is also applicable to problems with non-differentiable sem...

Journal:
  • CoRR

Volume: abs/1205.1782

Publication date: 2012